Create a new file named cogs2020_hw_7.R and save it in the cogs2020 folder that you created during Lecture 1.
If you have not already installed the data.table and the ggplot2 packages, do so now. If you use the install.packages function, do not include it in the script that you hand in.
Make the first line of your script library(data.table) and the second line library(ggplot2).
Make the next line of your script rm(list=ls()). This line of code will erase any variable defined before it. Do not put this line of code after anything important that you want to keep (e.g., so that I can mark your work).
After rm(list=ls()), create a variable named my_name and set its value equal to a character vector (i.e., letters surrounded by "" or by '') containing your name.
After rm(list=ls()), create a variable named my_student_id and set its value equal to a character vector (i.e., letters surrounded by "" or by '') containing your student id.
So far, here is what your file should look like if your name is John Doe and your student ID is 12345678. Note that the data.table named d, which many of the problems below require you to manipulate, is created by code given later in this document.
# load the packages we will need
library(data.table)
library(ggplot2)
# clean session
rm(list=ls())
# basic id info
my_name <- "John Doe"
my_student_id <- "12345678"
# Please include the following line of code as well
set.seed(0)
This homework requires you to use the stringr package, so please install it using whatever method you like.
Do NOT include install.packages in the homework script you submit.
After you get stringr installed, load it using library(stringr).
The top of your script should now look analogous to the following:
# Basic setup code that you will need for this problem
library(data.table)
library(ggplot2)
library(stringr)
rm(list=ls())
# basic id info
my_name <- "John Doe"
my_student_id <- "12345678"
# Please include the following line of code as well
set.seed(0)
This homework relies on magnetoencephalography (MEG) data collected from a single participant while they performed a category learning experiment. On each trial of the category learning experiment, the participant viewed a circular sine wave grating, and had to push a button to indicate whether they believed the stimulus belonged to category A or category B. We have seen and worked with this type of category learning experiment many times throughout this course, and it is further described by the following figure.
MEG is used to record the time-series of magnetic and electric potentials at the scalp, which are generated by the activity of neurons. There are many sensors, each configured to pick up signal from a different position on the scalp. This is shown in the following figure (the text labels indicate the channel name and are placed approximately where the MEG sensor is located on a real head).
The data file that we will be working with is arranged into epochs aligned to stimulus presentation. This means that every time a stimulus is presented we say that an epoch has occurred. We assign a time of \(t=0\) to the exact moment the stimulus appeared, and then typically look at the neural time series from just before the stimulus appeared to a little while after it appeared. For these data, each epoch starts 0.1 seconds before stimulus onset and ends 0.3 seconds after stimulus onset. The following figure shows the MEG signal at every sensor location across the entire scalp for 5 time points within this \([-0.1s, 0.3s]\) interval.
The data can be loaded into a data.table and the columns renamed to eliminate spaces by using the following code:
d <- fread('https://crossley.github.io/cogs2020/data/eeg/epochs.txt')
# The column names that come from this file have spaces
# This line removes those spaces (depends on the `stringr` package)
names(d) <- str_replace_all(names(d), c(" " = "." , "," = "" ))
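To see what the renaming line does, here is a small sketch that applies the same str_replace_all call to a made-up vector of column names. The names in example_names are invented for illustration and are not guaranteed to match the real column names in the data file.

```r
library(stringr)

# Hypothetical column names, invented for illustration only
example_names <- c("time", "condition", "epoch", "MEG 133", "MEG 135")

# Same call as above: every space becomes ".", every comma is dropped
clean_names <- str_replace_all(example_names, c(" " = ".", "," = ""))
print(clean_names)
```

After this step a name containing a space can be used directly inside data.table expressions without backticks.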
The time column contains times in seconds relative to stimulus onset. Stimulus onset always occurs at \(0\) seconds.
The condition column indicates which category the stimulus belonged to for the given epoch. We won’t make use of this column here, and we will remove it below.
The epoch column is the epoch number. You can think of this like we have usually thought of trial columns in examples throughout the course.
The many different MEG xyz columns contain the actual neural time series signals for each sensor. See the above figure for how these column names map onto scalp positions.
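Given this column layout, a per-epoch summary of one sensor can be sketched as follows. This is only an illustration on a tiny made-up table: toy, MEG.001, and the values in it are assumptions and are not taken from the real data file.

```r
library(data.table)

# Toy data: 3 epochs x 4 time points, one made-up sensor column
toy <- data.table(
  epoch   = rep(1:3, each = 4),
  time    = rep(c(-0.1, 0.0, 0.1, 0.2), times = 3),
  MEG.001 = c(1, 2, 3, 5,  2, 2, 4, 6,  0, 1, 1, 3)
)

# Keep only post-stimulus samples (time > 0),
# then average the sensor signal separately within each epoch
toy_means <- toy[time > 0, .(mean_signal = mean(MEG.001)), by = epoch]
print(toy_means)
```

The result has one row per epoch, which is the shape of data the hypothesis tests in this homework expect.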
Consider two random variables \(X \sim \mathcal{N}(\mu_X, \sigma_X)\) and \(Y \sim \mathcal{N}(\mu_Y, \sigma_Y)\). Let \(X\) generate data for MEG channel 133 and \(Y\) generate data for MEG channel 135. Test the hypothesis that the mean MEG signal for \(t > 0\) differs significantly between these two channels. When computing the mean MEG signal, keep epochs separate and average over everything else. You should be left with one observation per epoch. Assume that \(\sigma_X = \sigma_Y\) and also assume that \(X\) and \(Y\) are independent.
Store the observed \(t\) value of this test in a variable named ans_1_t_test_stat_obs.
Store the lower critical value in a variable named ans_1_critical_value_lower.
Store the upper critical value in a variable named ans_1_critical_value_upper.
Store the observed \(95\%\) CI lower value in a variable named ans_1_CI_lower.
Store the observed \(95\%\) CI upper value in a variable named ans_1_CI_upper.
Store the observed \(p\)-value in a variable named ans_1_p_value.
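As a reminder of where each of these quantities comes from, here is a minimal sketch of a two-sided, equal-variance, independent-samples t-test. It uses simulated numbers, not the MEG data, and every variable name in it is made up for illustration.

```r
set.seed(1)

# Simulated stand-ins for two sets of per-epoch means (not the MEG data)
x <- rnorm(40, mean = 0.5, sd = 1)
y <- rnorm(40, mean = 0.0, sd = 1)

# Two-sided independent-samples t-test with pooled variance
res <- t.test(x, y, alternative = "two.sided", var.equal = TRUE, conf.level = 0.95)

t_obs <- unname(res$statistic)   # observed t value
dof   <- unname(res$parameter)   # degrees of freedom: n_x + n_y - 2

# Critical values for a two-sided test at alpha = 0.05
crit_lower <- qt(0.025, dof)
crit_upper <- qt(0.975, dof)

# 95% confidence interval for mu_X - mu_Y, and the p-value
ci_lower <- res$conf.int[1]
ci_upper <- res$conf.int[2]
p_val    <- res$p.value

# The same t value computed by hand from the pooled variance
n_x <- length(x)
n_y <- length(y)
sp2 <- ((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2)
t_manual <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n_x + 1 / n_y))
```

Computing the statistic by hand alongside t.test is a cheap way to confirm that you have set the var.equal argument the way the problem asks.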
Consider two random variables \(X \sim \mathcal{N}(\mu_X, \sigma_X)\) and \(Y \sim \mathcal{N}(\mu_Y, \sigma_Y)\). Let \(X\) generate data for MEG channel 039 during the first 30 epochs and \(Y\) generate data for MEG channel 039 during the remaining epochs. Test the hypothesis that the mean MEG signal for \(t > 0\) differs significantly between these two signals. Assume \(X\) and \(Y\) are independent but do not assume that \(\sigma_X=\sigma_Y\).
Store the observed \(t\) value of this test in a variable named ans_2_t_test_stat_obs.
Store the lower critical value in a variable named ans_2_critical_value_lower.
Store the upper critical value in a variable named ans_2_critical_value_upper.
Store the observed \(95\%\) CI lower value in a variable named ans_2_CI_lower.
Store the observed \(95\%\) CI upper value in a variable named ans_2_CI_upper.
Store the observed \(p\)-value in a variable named ans_2_p_value.
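When equal variances are not assumed, the test statistic and the degrees of freedom both change. The sketch below shows this on simulated numbers with unequal group sizes and spreads. It is not the MEG data, and all names are made up for illustration.

```r
set.seed(2)

# Simulated stand-ins with different sample sizes and different spreads
x <- rnorm(30, mean = 0.0, sd = 1)
y <- rnorm(50, mean = 0.3, sd = 2)

# Welch's t-test: independent samples, variances not assumed equal
res <- t.test(x, y, alternative = "two.sided", var.equal = FALSE, conf.level = 0.95)

t_obs     <- unname(res$statistic)
dof_welch <- unname(res$parameter)   # generally not an integer

# Critical values use the Welch degrees of freedom
crit_lower <- qt(0.025, dof_welch)
crit_upper <- qt(0.975, dof_welch)

# The same quantities by hand (Welch-Satterthwaite approximation)
vx <- var(x) / length(x)
vy <- var(y) / length(y)
t_manual   <- (mean(x) - mean(y)) / sqrt(vx + vy)
dof_manual <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
```

Note that the critical values must be computed from the Welch degrees of freedom rather than from n_x + n_y - 2.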
Do you think two different MEG channels on the same person's head are likely to be independent? Explain your reasoning in a brief comment (no more than a sentence or two).
Consider two random variables \(X \sim \mathcal{N}(\mu_X, \sigma_X)\) and \(Y \sim \mathcal{N}(\mu_Y, \sigma_Y)\). Let \(X\) generate data for MEG channel 039 and \(Y\) generate data for MEG channel 135. Test the hypothesis that the mean MEG signal per epoch for \(t > 0\) differs significantly between these two channels. Do not assume \(X\) and \(Y\) are independent.
Store the observed \(t\) value of this test in a variable named ans_4_t_test_stat_obs.
Store the lower critical value in a variable named ans_4_critical_value_lower.
Store the upper critical value in a variable named ans_4_critical_value_upper.
Store the observed \(95\%\) CI lower value in a variable named ans_4_CI_lower.
Store the observed \(95\%\) CI upper value in a variable named ans_4_CI_upper.
Store the observed \(p\)-value in a variable named ans_4_p_value.
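When the two samples are not independent and the observations come in matched pairs, one common approach is a paired t-test, which is a one-sample t-test on the pairwise differences. The sketch below illustrates this on simulated, deliberately correlated numbers. It is not the MEG data, and all names are made up for illustration.

```r
set.seed(3)

# Simulated paired observations: y is built from x, so the two are correlated
x <- rnorm(25, mean = 1, sd = 1)
y <- x + rnorm(25, mean = 0.2, sd = 0.5)

# Paired t-test on the matched observations
res <- t.test(x, y, alternative = "two.sided", paired = TRUE, conf.level = 0.95)

t_obs <- unname(res$statistic)
dof   <- unname(res$parameter)   # number of pairs minus 1

# Critical values for a two-sided test at alpha = 0.05
crit_lower <- qt(0.025, dof)
crit_upper <- qt(0.975, dof)

# The same t value by hand: a one-sample test on the differences
diffs    <- x - y
t_manual <- mean(diffs) / (sd(diffs) / sqrt(length(diffs)))
```

The pairing only makes sense when element i of x and element i of y refer to the same unit, so check that both vectors are in the same order before running the test.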
A common error on these problem sets is to accidentally erase or overwrite variables from one question with those from another. Another common error is to name your variables slightly wrong (e.g., Ans_1a instead of ans_1a). Yet another common error is to include a line of code or two that generates an error, and sometimes this error is serious enough to prevent most or all of your script from running, in which case you will lose most or all of your marks for that assignment. All of these can be a very frustrating way to lose marks, but you will indeed lose marks if you make these mistakes, so you need to be very careful!
To ensure that this doesn’t happen to you, please run your entire .R script from start to finish. One way to do this is to use Code > Run Region > Run All, but of course there are shortcuts for everything so do as you wish. If there are any errors at all when you do this, it is essential that you address them.
After all of your code has executed from start to finish without any errors at all, carefully inspect your workspace to ensure that the following variables are defined. A reasonable way to do this is to try to print each one of these variables to the console. If it is defined, then it will print without error.
my_name
my_student_id
ans_1_t_test_stat_obs
ans_1_critical_value_lower
ans_1_critical_value_upper
ans_1_CI_lower
ans_1_CI_upper
ans_1_p_value
ans_2_t_test_stat_obs
ans_2_critical_value_lower
ans_2_critical_value_upper
ans_2_CI_lower
ans_2_CI_upper
ans_2_p_value
ans_4_t_test_stat_obs
ans_4_critical_value_lower
ans_4_critical_value_upper
ans_4_CI_lower
ans_4_CI_upper
ans_4_p_value
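If printing each variable by hand feels tedious, the check can also be sketched in a few lines that you run in the console after your script has finished. This is just one possible approach, and the helper names required, is_defined, and missing_vars are made up for illustration.

```r
# Names of every variable that should exist after the script has run
required <- c(
  "my_name", "my_student_id",
  paste0("ans_", rep(c(1, 2, 4), each = 6), "_",
         c("t_test_stat_obs", "critical_value_lower", "critical_value_upper",
           "CI_lower", "CI_upper", "p_value"))
)

# TRUE for each name that is currently defined in the global workspace
is_defined <- vapply(required,
                     function(v) exists(v, envir = globalenv()),
                     logical(1))

# Report anything that is missing
missing_vars <- required[!is_defined]
if (length(missing_vars) == 0) {
  message("All required variables are defined.")
} else {
  message("Missing variables: ", paste(missing_vars, collapse = ", "))
}
```

Because R is case sensitive, a variable named Ans_1_p_value would be reported as missing here.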
Finally, be sure that the file you submit to iLearn is a .R file and nothing else. Any other extension cannot be marked. This means do not submit .Rmd, .RProj, .pdf, .html or anything else. You must submit a .R file.